ABSTRACT

A manual describing the RIC computer search program
for retrieval of information from ERIC, CIJE, and other collections
is presented. Two versions of this program
have been developed. The first is for an IBM 360/370 computer. This
version has been operational on a production basis for nearly a year.
Four installations of this version have been made by RIC for the
State Educational Agencies in Iowa, Kansas, Massachusetts, and Texas.
The second version is for mini-computers using the BASIC programming
language. This version, still in the developmental stages, is
operational only on a Digital Equipment Corporation PDP-12
time-sharing mini-computer. Numerous programming modifications to
improve the performance of this version, as well as documentation of
this program, are underway. A third version, a combination of the
other two versions, appears to be economically most attractive. This
"hybrid" approach uses a 360/370 computer for performing the logic
and the mini-computer for printing. The manual places emphasis on how
to utilize the RIC program to perform searches; consideration of some
technical aspects is given in the Appendices. The approach taken by
RIC, utilization of the ERIC Descriptor Postings, provides one
alternative to the computer search program, QUERY. (Author/CK)

FOREWORD

The contents of this manual describe the RIC computer search program
for retrieval of information from ERIC, CIJE, and other collections.
Actually, two versions of this program have been developed. The first
is for an IBM 360/370 computer. This version has been operational on
a production basis for nearly a year. Four installations of this version
have been made by RIC for the State Educational Agencies in Iowa, Kansas,
Massachusetts, and Texas. The second version is for mini-computers
using the BASIC programming language. This version, still in the
developmental stages, is operational only on a Digital Equipment
Corporation PDP-12 time-sharing mini-computer. Numerous programming
modifications to improve the performance of this version, as well as
documentation of the program, are underway at this time. However, a
third version, a combination of the other two versions, appears to be
economically most attractive. This "hybrid" approach uses a 360/370
computer for performing the logic and the mini-computer for printing.

Emphasis is placed in this manual on how to utilize the RIC program
to perform searches; however, consideration of some technical aspects
is given in the Appendices. While emphasizing how to code searches,
the treatment in this short manual obviously cannot cover all possible
circumstances. A user of the RIC computer search program should contact
the authors for assistance with regard to any problem encountered which
is not covered in this manual.

The approach taken by RIC, utilization of the ERIC Descriptor
Postings, provides one alternative to the most widely known computer
search program, QUERY. After reading this manual, educators interested
in obtaining further information about, or in installing, the RIC
computer search program should contact either of the authors.

The authors wish to thank the past North Dakota Title III State
Coordinator, Vernon Eberly, and present Coordinator, Glenn Dolan, for
the Title III support of RIC without which the development of the RIC
search program would never have been attempted. The University of
North Dakota Computer Center, Conrad Dietz, Director, deserves
considerable credit for facilitating the development and operation of the
IBM 360 version. The South Junior High School, Grand Forks School
District, Computer Center, Walter Knipe, Director, established under
a Title III ESEA grant from the North Dakota Department of Public
Instruction, deserves credit for sponsoring the mini-computer
development effort. Finally, the authors wish to thank the initiators of the
four installations of the RIC computer search program, Alice Schafer
of Mitre Corporation, Boston; Dorthy Mueller, Texas Department of
Education, Austin; Richard Herlig, Kansas State Department of Education,
Topeka; and Mary Jo Bruett, Iowa Department of Public Instruction,
Des Moines, for their many helpful suggestions for improving the
program.

CHAPTER 1

ERIC AND THE NEED FOR COMPUTER RETRIEVAL CAPABILITY

Anyone familiar with American education realizes that it is undergoing
a knowledge explosion similar to that occurring in the sciences.
The contents of the explosion include curriculum materials, instructional
media, research reports, program descriptions, training materials, and
numerous other types of information.

Until a few short years ago (the mid-sixties) this explosion in
education threatened to get out of hand. However, ERIC made its
appearance in time to forestall the threatened inundation.

Briefly, ERIC is a system of twenty clearinghouses, each responsible
for searching out, reviewing, assigning terms descriptive of the content,
and abstracting pertinent fugitive literature in an important domain of
education. Documents and their abstracts selected for inclusion in ERIC
are forwarded to Central ERIC, a branch of the National Center for
Educational Communications of the United States Office of Education. Central
ERIC in turn submits the materials from all clearinghouses to a contractor
who microfilms documents which are not copyrighted. Each document appears
on one or more microfiche, a 4 by 6 inch sheet of microfilm which can hold
up to 72 pages. The contractor also prepares a monthly publication, Research
in Education (RIE), containing the abstract and bibliographic information for
each document. This publication serves as an index to the ERIC collection
using the descriptor terms assigned to the documents by the clearinghouses
as the index terms. The clearinghouses also review the educational articles
of over 550 journals. The articles are assigned descriptors so that they
can be indexed in the same manner as the documents in RIE, and the
bibliographic information appears in a monthly publication titled Current
Index to Journals in Education (CIJE); however, microfiche copies of the
articles are not available.

While any organization can subscribe to the indices and obtain copies
of the ERIC microfiche, the ERIC system is not without its special problems.
Most obviously, thousands of educators must be informed of this system and
its potential uses. Educators must then be provided ready access to ERIC
materials, microfiche readers, and microfiche printers in order to maximize
the utilization of the source documents on microfiche. Most importantly,
educators must be stimulated and motivated to develop professionalism
similar to that of the scientist in order that they might continually seek
the best education program possible. This includes the effort to remain
current within a specialty, something too few educators can yet boast.

It is not the intent of this document to cover the basic problems
of the ERIC system. The narrower concern here is the rapidly
expanding holdings of ERIC and CIJE. As of July 1, 1972, ERIC had indexed
54,390 documents and CIJE had indexed 45,271 articles, and the respective
yearly rates of expansion for these two collections were 12,329 and 17,671.
Any individual desiring to search the ERIC and CIJE holdings for
information pertinent to a particular topic faces a major task, even
with the monthly issues of RIE and CIJE, the yearly summary indices, and the
thesaurus of descriptor terms, and even assuming the person has access to
these indices. As a means of lessening the burden on the individual, the
computer is now being called upon to perform the clerical chores of searching
the ERIC and CIJE collections. It is the function of this document to
describe one approach to computerized searching of the ERIC information
collections.



Before considering further the RIC computer search procedures, the
reader should note several definitions. Document refers to any single
entry into the ERIC information collections. It is even used to refer
to an article in CIJE. Abstract specifically refers to the short
description of the contents of each document. For articles in CIJE an abstract
frequently does not appear. Resume refers to the abstract with appended
bibliographic data, i.e., the entire entry for a document appearing in
RIE or CIJE. Document, abstract, and resume are used synonymously
throughout this publication. Finally, abstract or resume numbers are also termed
accession numbers. An accession number is a six-digit number prefaced
by ED for documents in RIE and by EJ for articles in CIJE.
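
The accession-number format just described is simple enough to check mechanically. The following Python sketch is only an illustration and is not part of either version of the RIC program; the name parse_accession is hypothetical, and the sketch assumes that only the ED and EJ prefixes named above need to be recognized.

```python
import re

# Hypothetical helper, not taken from the RIC program: it checks the
# "ED" or "EJ" prefix followed by exactly six digits, as described in the text.
ACCESSION_RE = re.compile(r"^(ED|EJ)(\d{6})$")


def parse_accession(number: str) -> tuple[str, int]:
    """Split an accession number into (collection, serial number)."""
    match = ACCESSION_RE.match(number.strip().upper())
    if match is None:
        raise ValueError(f"not an ED/EJ accession number: {number!r}")
    prefix, digits = match.groups()
    collection = "RIE" if prefix == "ED" else "CIJE"
    return collection, int(digits)


print(parse_accession("ED000984"))  # ('RIE', 984)
```

A number with the wrong length, such as ED00984, is rejected rather than silently padded, which mirrors the care with exact spelling that the manual asks of the coder.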

CHAPTER 2

RIC AND ITS COMPUTER SEARCH PROCEDURE

RIC (the Resource Information Center) was established September 1,
1969, through a contract with the North Dakota Department of Public
Instruction using Title III ESEA administrative funds. During the first year of
this contract, developmental work was undertaken by the Upper Red River
Valley Educational Service Center. The two renewals of this contract have
seen RIC operate from quarters on the University of North Dakota campus,
where ready access is assured to a complete ERIC microfiche collection
in the ERIC Center of the University Library. RIC also is the recipient,
commencing July 1, 1971, of a National Center for Educational
Communications, U.S. Office of Education, grant for developing a "local educational
information center." The Grand Forks School District serves as the
contractor for this grant.

The purpose of RIC is to provide, as an arm of the North Dakota
Department of Public Instruction, a comprehensive "one-stop" source of educational
information for the educators of the State. RIC undertakes a number of
activities intended to foster awareness and utilization of its services.
These activities include inservice workshops, slide/tape presentations,
brochures, a monthly Newsletter, and follow-up contacts with users. The
ERIC and CIJE collections serve as the primary source of information;
however, RIC also uses the University Library's holdings, ALERT (from
the Far West Educational Laboratory), and a number of locally developed
collections. Special products of RIC include brief reports summarizing
the literature on critical educational topics and announcement of PREP.

RIC places emphasis on media specialists (librarians) as its primary
information contacts with the educators of the State. Training efforts
focus on this group because RIC's limited staff could not personally reach
all educators. Regional centers, where microfiche reader/printers, copies
of Research in Education, and RIC publications are available, have been
established throughout the State within reasonably easy access of all
educators until such time as their school districts provide microfiche
readers for their use.

Efforts of RIC to secure computer search capability date back to 1969.
At that time it was noted in one of the ERIC clearinghouse newsletters that
a copy of the ERIC computer tape could be secured on loan from Central ERIC.
RIC in turn made a copy of this tape and initiated an effort to develop a
computer search procedure on an IBM 360/30 computer.

While progress was being made in reading the ERIC tape, RIC terminated
development of this program later in 1969 when announcement of the QUERY
search program was received from Central ERIC. RIC subsequently became
the twenty-second installation of the QUERY program. Unfortunately, it
was quickly found that QUERY was uneconomical considering several factors,
including the limited capacity of the 360/30 with its very slow (18 1/2
inches per second) tape drives. (However, this was a blessing in disguise
in that it halted what would have proved to be an equally uneconomical and
costly development of software.)

Casting around for an alternate approach to securing an economical
computer search capability, the RIC director was introduced by James Eller
of Central ERIC to the publication ERIC Descriptors - Term Usage Postings
and Term Usage Statistics, produced by LEASCO, the computer contractor for
ERIC. RIC was one of the first to order this publication. More importantly,
the nature of this publication, an alphabetic listing of ERIC descriptors
with a numeric listing of all documents indexed by each descriptor, caused
the RIC director to contact LEASCO with a request for a copy of the computer
tape from which this publication was printed. (The first page of the
descriptor postings appears as Figure 1.) RIC was the first non-ERIC
agency to receive what is now called the Descriptor Posting Tape or USEMAST.

In the approach to searching the ERIC collection taken by QUERY
and many of its modifications, the ERIC resume or document tapes are read
directly by the search program. Descriptor terms for each document are
checked directly to see if they meet the criteria set by the logic for each
search included in the batch of searches under consideration. Hits, or
documents which meet the criteria, are stored on tape until a later time
when they are sorted by search number and printed. While QUERY provides
many nice features, such as comparing on a descriptor prefix or suffix,
this is at the expense of hundreds or even thousands of comparisons for
each document when it is considered that there are an average of 10.5
descriptors assigned per document and a batch of searches may contain a
hundred or more descriptors. QUERY also frequently has a fairly complicated
procedure for coding searches into computer-usable form.

FIGURE 1

A Page from the RIE Descriptor Postings or Inverted File

[Figure: a page of the descriptor postings. Each descriptor, listed in
alphabetical order, is followed by the accession numbers of all documents
indexed by it. The descriptors on this page are Abbreviations, Ability,
Ability Grouping, Ability Identification, Able Students, Abortions,
Abstracting, Abstraction Levels, Abstraction Tests, and Abstract Reasoning.]

By limiting the search capability to the essential element, the ability
to perform AND/OR/NOT logic, the use of the descriptor-posting tape can
reduce the time required to perform a comparable batch of searches against
the entire ERIC file to a fraction of that required by QUERY. Instead of
entering each resume to check each descriptor in the batch of searches, the
accession numbers of the resumes which are indexed by each descriptor are
read from the descriptor-posting tape, or from random-access disk for even
greater time savings. The logic is then performed on the accession numbers
assigned to each descriptor rather than on the descriptors themselves, a
considerably faster process. The "hits" for each search are next sorted
into numeric order. Finally, the resume tapes are entered, but only to
compare each resume number with the next number in the ordered list of
hits. When a match is found, the result is either printed or stored on
a work tape until such time as the hits can be sorted back into order by
search number.
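
The posting-list approach described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the RIC implementation: the postings, resume texts, and function names (find_hits, fetch_resumes) are invented for the example, and in-memory lists stand in for the posting tape and the resume tape.

```python
# Toy inverted file: descriptor -> accession numbers (invented data).
postings = {
    "ABILITY GROUPING": ["ED001063", "ED001111", "ED002254"],
    "ABLE STUDENTS": ["ED001111", "ED001164", "ED002254"],
}

# Toy resume file, held in accession-number order like a sequential tape.
resume_tape = [
    ("ED001063", "resume text 1063"),
    ("ED001111", "resume text 1111"),
    ("ED001164", "resume text 1164"),
    ("ED002254", "resume text 2254"),
]


def find_hits(a: str, b: str) -> list[str]:
    """AND two descriptors by intersecting their posting lists, then sort."""
    return sorted(set(postings[a]) & set(postings[b]))


def fetch_resumes(hits: list[str], tape) -> list[str]:
    """Make one forward pass over the tape, comparing each resume number
    with the next number in the ordered list of hits."""
    found, i = [], 0
    for number, text in tape:
        while i < len(hits) and hits[i] < number:
            i += 1            # this hit is not on the tape; skip it
        if i == len(hits):
            break             # every hit has been handled; stop reading
        if number == hits[i]:
            found.append(text)
            i += 1
    return found


hits = find_hits("ABILITY GROUPING", "ABLE STUDENTS")
print(fetch_resumes(hits, resume_tape))
```

The point of the design is that the descriptors are never compared against individual resumes; the resume file is read only once, and only accession numbers are compared during that pass.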

Figure 2 reveals that the RIC search program
actually functions as two distinct parts. The first part accepts a batch
of searches and alphabetizes the descriptors. Next, by searching the
descriptor-posting tape for the alphabetized descriptors, a list of
accession numbers, which contains all the possible hits for each search,
is obtained. Then the logic assigned to the descriptors is performed.
Finally, the resultant hits are both printed and stored on disk by search
number.
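
The reason for alphabetizing the descriptors can be illustrated with a short Python sketch. It assumes, as the description of the postings suggests, that the posting tape is itself in alphabetical order by descriptor, so that all descriptors in a batch can be collected in a single forward pass. The function names and data below are hypothetical, and the real tape's collating sequence may differ from Python's string ordering.

```python
from collections.abc import Iterable, Iterator


def read_posting_tape() -> Iterator[tuple[str, list[str]]]:
    """Stand-in for the descriptor-posting tape (invented records,
    delivered in alphabetical order by descriptor)."""
    yield "ABILITY", ["ED001617", "ED001731"]
    yield "ABLE STUDENTS", ["ED001149", "ED001164"]
    yield "ABSTRACTING", ["ED003474", "ED003776"]


def collect_postings(
    searches: dict[int, list[str]],
    tape: Iterable[tuple[str, list[str]]],
) -> dict[str, list[str]]:
    """Gather postings for every descriptor in the batch in one pass."""
    wanted = sorted({d for terms in searches.values() for d in terms})
    found: dict[str, list[str]] = {}
    i = 0
    for descriptor, numbers in tape:
        while i < len(wanted) and wanted[i] < descriptor:
            i += 1            # wanted term is not on the tape (e.g. misspelled)
        if i == len(wanted):
            break
        if descriptor == wanted[i]:
            found[descriptor] = numbers
            i += 1
    return found


# Search 2 contains a misspelled descriptor ("ABLE STUDENT").
batch = {1: ["ABSTRACTING", "ABILITY"], 2: ["ABILITY", "ABLE STUDENT"]}
found = collect_postings(batch, read_posting_tape())
print(sorted(found))
```

Because both lists are in the same order, the tape never has to be rewound, and a descriptor that does not appear on the tape is simply passed over.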

Before entering the second part of the RIC search program, it is
possible to review the hits to see if they fit the question asked. This
is a distinct advantage over QUERY, where the results are seen only after
the resumes are printed. It is possible to enter a number of options
providing for the printing of only those resumes desired. The second
part of the RIC search program applies these options to the lists of hits
stored on disk. The resume tapes are then entered to find the desired
resumes. These resumes are either directly printed or stored on disk or
tape for sorting in order by search number prior to printing.

Besides the printing options, a number of other options as to page
size and length, number of resume tapes, and accession numbers on each
tape are built into the program. Appendix A contains copies of
the documentation for these options as it appears in printouts
from the IBM 360 version of the program.

FIGURE 2

Diagram of Resource Information Center Computer Search Program

CHAPTER 3

CODING SEARCHES FOR COMPUTER INPUT

Essential to utilizing any ERIC computer search program is an
understanding of the techniques for coding searches. Figure 3 illustrates
the elements of a search coded for the mini-computer and for the 360/370
versions of the RIC computer search program. An understanding of only
four elements is required to code searches: 1) search number, 2) end
of search symbol, 3) descriptors, and 4) logic-parentheses. The first
three elements are self-explanatory; only the last element will be
considered in this chapter.

Before detailing the coding procedures for the RIC computer search
program, the reader should become thoroughly familiar with the Thesaurus
of ERIC Descriptors. It is essential that anyone using the RIC computer
search program, or any other search program, select terms from the Thesaurus.
Spelling must be carefully checked, since as simple a matter as leaving
an "s" off a descriptor will invalidate the particular search of which
the descriptor is a part. The remaining searches in the batch, however,
will be processed in the normal fashion.

Coding for the Mini-Computer Search Program

Assembling a search for mini-computer input is a very simple
procedure. The coder should use paper marked off in 48-space lines for
teletype input as shown in the example in Figure 3. A search may involve
one or more lines of information. Descriptor terms may be broken in the
middle of a word or may be entirely contained on a single line, since
spaces are ignored by the computer. However, if a word is broken, be
sure that a hyphen (-) is not used as would be the case with printed matter.

FIGURE 3

Examples of Search Coding Elements



After each descriptor term one of the following five symbols must
appear to tell the computer program how to process the descriptor:

%   apply OR logic
&   apply AND logic
@   apply AND logic to two or more descriptors joined by OR logic
#   apply NOT logic
$   end of search. Must be used or the search will not terminate.

Coding for the 360/370 Computer Search Program

The 360/370 version of the RIC computer search program should be
coded on standard 80-column data forms for punching IBM cards. The first
four columns are reserved for the search number. A search may involve
more than one line of information. Since spaces are ignored, descriptors
may be broken at any point or contained entirely on a line. Again, do
not use a hyphen (-) when breaking a word.

The following three logic symbols are used between descriptors:

.OR.
.AND.
.NOT.

The periods are essential.

The dollar sign ($) is used to signify the end of each search.
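
To make the card layout concrete, the following Python sketch splits a search coded for the 360/370 version into its elements. It is a hedged illustration, not the RIC program's input routine: tokenize_cards is a hypothetical name, the search number is assumed to occupy columns 1-4 of every card including continuation cards, and descriptors are assumed to contain no periods or parentheses of their own.

```python
import re

# The three logic symbols from the text, plus parentheses.
TOKEN_RE = re.compile(r"\.OR\.|\.AND\.|\.NOT\.|[()]")


def tokenize_cards(cards: list[str]) -> tuple[str, list[str]]:
    """Return (search number, tokens) for one search spread over cards."""
    search_number = cards[0][:4].strip()
    # Drop columns 1-4 of each card, join the cards, and remove all blanks,
    # since the text states that spaces are ignored.
    text = "".join(card[4:] for card in cards).replace(" ", "")
    if "$" not in text:
        raise ValueError("search is not terminated by $")
    text = text[: text.index("$")]
    tokens, pos = [], 0
    for match in TOKEN_RE.finditer(text):
        if match.start() > pos:
            tokens.append(text[pos:match.start()])  # a descriptor
        tokens.append(match.group())                 # logic symbol or parenthesis
        pos = match.end()
    if pos < len(text):
        tokens.append(text[pos:])                    # trailing descriptor
    return search_number, tokens


cards = [
    "0001ABLE STUDENTS .AND. (ABILITY GROU",
    "0001PING .OR. ABILITY IDENTIFICATION) $",
]
print(tokenize_cards(cards))
```

Because blanks are removed before the text is split, a descriptor broken across two cards without a hyphen is rejoined automatically, which is why the manual warns against hyphenating.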

Examples of Coding Logic

Several examples will be provided to demonstrate the application
of these two search programs. The mini-computer search will appear on the
left-hand side and the 360/370 version on the right-hand side of the page.
Descriptor terms will be assigned letters of the alphabet in these examples.

1. A % B $                    A .OR. B $

Application of OR (%) logic results in the combining of posting lists
for these two descriptors. Duplicate numbers are retained only once.
For example, the following postings would produce the results shown:

A        ED000667  ED000984  ED001856
B        ED000875  ED000984  ED001122
Result   ED000667  ED000875  ED000984  ED001122  ED001856

Note that the dollar sign ($) appears, as it must, as the termination of
the search for either version of the RIC computer search program.

2. A & B $                    A .AND. B $

Application of AND (&) logic results in only numbers common to the
postings for both A and B being retained. In the example given above,
only ED000984 is common to both descriptors. Thus, only one number would
result from performing this search.
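
Examples 1 and 2 correspond to set union and set intersection on the posting lists. The short Python sketch below reproduces the results stated above from the postings given for A and B; the function names apply_or and apply_and are illustrative only and are not taken from the RIC program.

```python
# Postings for descriptors A and B, as given in example 1.
a = ["ED000667", "ED000984", "ED001856"]
b = ["ED000875", "ED000984", "ED001122"]


def apply_or(x, y):
    """OR logic: combine both lists, keeping duplicates only once."""
    return sorted(set(x) | set(y))


def apply_and(x, y):
    """AND logic: keep only the numbers common to both lists."""
    return sorted(set(x) & set(y))


print(apply_or(a, b))   # ['ED000667', 'ED000875', 'ED000984', 'ED001122', 'ED001856']
print(apply_and(a, b))  # ['ED000984']
```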

3. A % B & C $                A .OR. B .AND. C $

Complex searches can be written by simply combining the AND (&)
and OR (%) logic of examples 1 and 2. It should be borne in mind that
the program conducts the search from left to right. Thus, the computer
reads the OR (%) symbol combining A and B as in example 1 before "seeing"
the AND (&) logic symbol. It is the combination of A and B which the
computer compares to C for numbers common to both as in example 2, rather
than comparing B to C as might seem to be the case when looking at the
example.

4. C & B % A $                C .AND. B .OR. A $

It will be noted that this is simply example 3 written in the reverse
order. However, the result of applying the logic will not be the same.
Note that when working from left to right the program first identifies
numbers common to the postings for both C and B. Next it combines the
common numbers with all the numbers in A. This is certainly not the same
as for example 3.
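
The difference between examples 3 and 4 can be checked with a small left-to-right evaluator. This Python sketch assumes only what the text states, namely that the logic is applied strictly from left to right with no operator precedence; the function name evaluate and the small integer postings are invented for illustration, and parentheses and .NOT. are deliberately left out.

```python
def evaluate(tokens, postings):
    """Apply .OR. and .AND. strictly left to right (no precedence)."""
    result = set(postings[tokens[0]])
    # tokens alternate: descriptor, operator, descriptor, operator, ...
    for op, name in zip(tokens[1::2], tokens[2::2]):
        if op == ".OR.":
            result |= postings[name]
        elif op == ".AND.":
            result &= postings[name]
        else:
            raise ValueError(f"unsupported operator: {op}")
    return sorted(result)


# Invented postings; small integers stand in for accession numbers.
postings = {"A": {1, 2}, "B": {2, 3}, "C": {3, 4}}

ex3 = evaluate(["A", ".OR.", "B", ".AND.", "C"], postings)  # (A or B) and C
ex4 = evaluate(["C", ".AND.", "B", ".OR.", "A"], postings)  # (C and B) or A
print(ex3, ex4)  # [3] [1, 2, 3]
```

With these postings, example 3 retains a single number while example 4 retains three, which shows why the order in which a search is written matters.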


5. C & B & A $                C .AND. B .AND. A $

On first glance this might appear to be the answer. However, working
from left to right reveals that the numbers common to C and B are compared
to A for numbers also common to it. The final results are numbers common
to the postings for all three descriptors.

What is required is, in effect, a set of parentheses around the latter
two descriptors, such as C & (B % A) $.

6. C @ B % A $                C .AND. (B .OR. A) $

The @ symbol serves to indicate the presence of a set of parentheses
in the mini-computer version; the 360/370 program accepts actual
parentheses up to five levels deep. Two or more descriptors connected by the
OR (%) symbol always follow the @ symbol. The @ symbol signifies that
the postings for each of these two or more descriptors will be compared
using AND (&) logic to the postings of the descriptors preceding the
symbol. The parentheses for both versions of the RIC computer search
program are terminated by another AND (&) or a dollar sign ($) symbol.
In the example, first B and next A are compared to C using AND (&) logic.

7. A % B 0 C % D $        A .OR. (B .AND. (C .OR. D)) $

This example is comparable to the previous one with the exception
that the postings for descriptors A and B are combined into one list
first before reaching the AND (0) symbol.

8. A 0 B % C & D $        A .AND. (B .OR. C) .AND. D $

In this example, only those numbers in A which are either in B or
C are retained since the 0 symbol operates like a set of parentheses used
in the 360/370 version. Remember that 0 terminates when an & or $ symbol
is reached; in this example, the & symbol after descriptor C. What now
happens is that the numbers which have been retained so far are compared
to the postings for descriptor D and only those numbers common to both
lists are retained as hits.
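
The behavior of the grouping symbol in example 8 might be sketched as follows. This is a hedged illustration in Python, not the BASIC program: the token "@" is a hypothetical stand-in chosen here for the grouping symbol, and the postings are invented.

```python
GROUP = "@"  # hypothetical stand-in for the manual's grouping symbol


def evaluate(tokens, postings):
    """Left-to-right evaluation with one level of grouping."""
    result = set(postings[tokens[0]])
    i = 1
    while tokens[i] != "$":
        symbol, name = tokens[i], tokens[i + 1]
        i += 2
        if symbol == "&":
            result &= postings[name]
        elif symbol == "%":
            result |= postings[name]
        elif symbol == GROUP:
            group = set(postings[name])
            # Every %-joined descriptor belongs to the group; the group
            # ends when an & or $ symbol is reached.
            while tokens[i] == "%":
                group |= postings[tokens[i + 1]]
                i += 2
            result &= group
        else:
            raise ValueError(f"unexpected symbol {symbol!r}")
    return result


# Hypothetical postings for example 8: A .AND. (B .OR. C) .AND. D
postings = {"A": {1, 2, 3, 4}, "B": {1, 5}, "C": {2, 6}, "D": {2, 3}}
hits = evaluate("A @ B % C & D $".split(), postings)
print(sorted(hits))
```

Numbers 1 and 2 survive the grouped comparison with A, and only 2 is also posted under D.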



9. A % B % C & D & E & F $        ((A .OR. B .OR. C .AND. D) .AND. E) .AND. F $

In this example the postings for descriptors A, B and C are combined
into one list. Next, matches between this list and D are sought and
written as a new list. This new list is compared in turn to E for matches
and another new list is written. Finally, this list is compared to F
for matches and the final hits are identified.

Now this example is somewhat artificial in that rarely does a search
involve over two levels of AND logic. The obvious reason is that numbers
common to more than two posting lists usually do not exist unless the
descriptors are practically identical, in which case only one of them would
normally be used. The main point of this example is that no parentheses
are required for AND logic in the mini-computer version while they are
for the 360/370 version.

10. A % B % C & D % E & F $        ((A .OR. B .OR. C) .AND. D) .OR. E .AND. F $

This example is identical to the preceding one except for the OR (%)
logic symbol following descriptor D. However, this % creates the most
serious coding problem for the mini-computer version yet encountered
since there are two other ways in which this example could be written,
each of which results in different hits.

10a. A % B % C 0 D % E & F $        ((A .OR. B .OR. C) .AND. (D .OR. E)) .AND. F $

10b. A % B % C & D $ and E & F $        ((A .OR. B .OR. C) .AND. D) .OR. (E .AND. F) $

For all three examples the postings for the first three descriptors
are combined into a single list. Now consider in turn what each of the
three examples will do next.

For example 10, those numbers common to D and the combination of A, B
and C will be combined with the entire posting list for E and compared to
F for common numbers.

For example 10a, those numbers common to the merger of D and E and the
combination of A, B and C will be compared to F for common numbers.

For example 10b, those numbers common to D and the combination of A,
B and C will be considered to be hits. Also the numbers common to E
and F will be considered to be hits. In effect, two separate searches
have resulted for the mini-computer version, but under the same search
number. (It is possible that duplicate abstracts would be printed by
this approach; however, a simple modification to the programs would remedy
this situation if it proves to be a serious problem.)

Consider the following simple example to see what happens. (Hits are
indicated by brackets.)

Descriptor   Postings    Example 10         Example 10a        Example 10b

A            1,5                      %                  %                  %
B            2,4         1,2,4,5      %     1,2,4,5      %     1,2,4,5      %
C            3,4         1,2,3,4,5    &     1,2,3,4,5    &     1,2,3,4,5    &
D            0,2,3,6     2,3          %     (2,3)        %     [2,3]        $
E            1,7,8       1,2,3,7,8    &     1,2,3        &                  &
F            1,2,5,8     [1,2,8]            [1,2]              [1,8]



As is very apparent, the coder must be sure to understand exactly
what it is he wishes the OR (%) to do. Even more important, the coder
must understand the search request which he is attempting to code.
Remember that the program processes from left to right. When coding a
search, it is recommended that you think from left to right as the computer
will do in order to determine whether the correct logic symbols have been
assigned to the search.
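
The three codings can be compared directly using the postings from the simple example above. The following Python sketch is an illustration only: it restates each coding as set operations according to the descriptions given for examples 10, 10a and 10b, and the variable names are chosen here for convenience.

```python
# Postings from the simple example, one set per descriptor.
A, B, C = {1, 5}, {2, 4}, {3, 4}
D, E, F = {0, 2, 3, 6}, {1, 7, 8}, {1, 2, 5, 8}

abc = A | B | C  # first three descriptors combined into a single list

# Example 10: (ABC and D), then combined with all of E, then compared to F.
hits_10 = ((abc & D) | E) & F

# Example 10a: ABC compared to the merger of D and E, then compared to F.
hits_10a = (abc & (D | E)) & F

# Example 10b: two separate searches under one search number.
hits_10b = (abc & D) | (E & F)

print(sorted(hits_10))
print(sorted(hits_10a))
print(sorted(hits_10b))
```

The three results differ even though the same six descriptors and the same posting lists are used in each case.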

11. A % B # C $        (A .OR. B) .NOT. C $

This brings us to the third type of logic, NOT (#). The NOT logic
means everything will be processed except what is in the descriptor or
set of descriptors following the NOT. Consider the postings listed
below.

A    ED 000567    ED 001123    ED 001535
B    ED 000967    ED 001197    ED 001268
C    ED 000868    ED 001123    ED 001268

Note that combining A and B results in six numbers, two of which are
contained under C. When the NOT logic is applied, these two are dropped
and the result is:

ED 000567    ED 000967    ED 001197    ED 001535

The NOT logic can be used as many times as is desired in a given search.
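
Example 11 corresponds to a set difference. The short Python sketch below uses the postings listed for A, B and C; it is an illustration of the logic only, not of how either version of the program stores its lists.

```python
A = {"ED 000567", "ED 001123", "ED 001535"}
B = {"ED 000967", "ED 001197", "ED 001268"}
C = {"ED 000868", "ED 001123", "ED 001268"}

combined = A | B          # A .OR. B gives six numbers
result = combined - C     # .NOT. C drops the two also posted under C

print(sorted(result))
```

The four numbers that remain are the ones shown in the text.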

Mini-Computer Summary

It will be noted that the mini-computer version of the RIC search
program is not quite as flexible as the IBM 360/370 version. Multiple
levels of parentheses, coding of searches entirely in English language,
and sorting of abstracts in order by search number are all not possible
at this stage in the development of the mini-computer version. However,
the mini-computer program does have the capability to enter searches at
any time; a batch of searches then consists of all searches entered since
the last run of the logic program.

Particular attention should be given to the limited parentheses
capability. Only one level of parentheses is possible using the 0 symbol.
This symbol is used when it is desired to combine two or more descriptors
and compare the results with one or more other descriptors using AND
logic.

It should also be noted that the mini-computer version still requires
considerable programming. For example, additional levels of parentheses,
more print options, rewriting portions of the program in assembler language,
and several other revisions should result in an even more cost-effective
program.

IBM 360/370 Version 

The 360/370 version is a more complete and better documented (see
Appendix A) program. The program has an especially good set of diagnostic
error messages. The only area requiring additional emphasis would appear
to be the matter of parentheses. As Figure 4 on the following page
indicates, there are two specific purposes for utilizing parentheses. First,
parentheses are used to group sets of descriptors when two or more AND or
NOT logic terms are used. A set of parentheses is required around each
pair of individual or sets of descriptors connected by an AND or NOT logic
term. The second use is to indicate sets of descriptors which are connected
by AND or NOT logic. It is important to check that the same number of right
and left hand parentheses have been used.




FIGURE 4

Two Uses of Parentheses in the IBM 360/370 Version

1. To indicate a group of terms which are to be ANDed or NOTed to
   another term or group of terms.

   Correct:      A .AND. (B .OR. C)

   Incorrect:    A .AND. B .AND. C
                 A .AND. B .OR. C

   The first incorrect method is too restrictive. It requires all
   three terms to be present before a hit results.

   The second incorrect method performs A .AND. B correctly; however,
   C is ORed to the result. Therefore, all the postings under C
   are added to the hits for A .AND. B.

2. To group the terms when two or more ANDs are present. This is
   an idiosyncrasy of the program, not something which is obvious.

   Incorrect:    A .AND. B .AND. C
   Correct:      A .AND. (B .AND. C)
   Correct:      (A .AND. B) .AND. C

   Note that these parentheses can be placed anywhere. If three
   or more ANDs appear in rare instances, note that sets of
   parentheses are required within parentheses.

                 A .AND. (B .AND. (C .AND. D))

Combinations of these two uses of parentheses often occur.

                 A .AND. (B .AND. (C .OR. D))
                 A .AND. ((B .OR. C) .AND. (D .OR. E))

The important caution to note: always check that the same number
of left and right parentheses are used.
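
The check recommended in Figure 4 can be made mechanically before a search is submitted. The Python sketch below is a hypothetical helper, not part of the RIC programs; the limit of five levels is taken from the earlier statement that the 360/370 program accepts parentheses up to five levels.

```python
def parentheses_balanced(expression, max_depth=5):
    """Return True if every "(" has a matching ")" within max_depth levels."""
    depth = 0
    for character in expression:
        if character == "(":
            depth += 1
            if depth > max_depth:
                return False      # nested more deeply than the stated limit
        elif character == ")":
            depth -= 1
            if depth < 0:
                return False      # a ")" appeared before its "("
    return depth == 0             # nothing may be left open


print(parentheses_balanced("A .AND. ((B .OR. C) .AND. (D .OR. E))"))
print(parentheses_balanced("A .AND. (B .AND. (C .OR. D)"))
```

The first expression passes the check; the second is missing a right parenthesis and fails.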
CHAPTER 4



HARDWARE CONFIGURATIONS WITH COST ANALYSES 

Hardware configurations will be considered under two headings: 
Configurations for the IBM 360/370 and Configurations for the Mini- 
computer. 

Configurations for the IBM 360/370 

When confronted by the unrealistic operational costs of the
original QUERY program, the Resource Information Center initiated an
effort to secure an alternative to QUERY. The RIC director had
previously considered the possibility of batching individual searches
to be run against the inverted file computer tape, which is actually
the magnetic tape for preparing the familiar descriptor postings. The
initial set of RIC programs consisted of segments in Fortran to perform
the logic and in Assembler to read the inverted file and abstract tapes.
As time allowed, the Fortran segments were rewritten in Assembler,
resulting in a considerable cost savings and improved operating
flexibility.

The basic operating specifications for these programs revolve
around breaking down the logic for each search in the batch into its
component parts and processing the results against all the ERIC and/or
CIJE postings for each of the descriptor terms used in the search logic.
At that point a listing of the "hits" resulting from each requested
search is printed. This listing of hits for each search may then be
utilized in several ways including: 1. generating ERIC abstracts
with equipment such as the Remington Rand REMCARD, and 2. editing prior
to abstract generation, including limitation of abstract generation to
either current or historic information.

Being extremely efficient and flexible, the RIC search programs 

can be installed on any of the IBM 360 or 370 computers from a 64K 
model 30 upwards. Since the IBM supported sort routine is incorporated 
as an operational part of the RIC search and retrieval strategy, and 
since this program is the largest user of core memory, it may be safely 
assumed that any IBM 360/370 series computer capable of supporting the 
IBM sort routine will also be capable of supporting the RIC search 

strategy. 

The Dedicated IBM 360 or 370 System 

A dedicated 360 or 370 system would tend to be least expensive
when using the smallest computer, i.e., the 360/30. The 360/30 CPU, for
example, is apparently able to handle the searches at a lower cost,
although at a much slower speed, than the 360/40, /50, /65, etc., which
are much faster, yet more costly, given the same tape and disk drive
speeds and capabilities. In almost all cases the greatest economy of
operation will result from use of the highest speed tape drives and the
largest IBM 2314 disk drives. At least one tape drive and two disk
drives are recommended; however, a modification of the program is
available which can operate on magnetic tape alone. In this case at
least three tape drives are recommended.
Time Sharing and the IBM 360 or 370 System

The time sharing configuration is the most difficult for which
to determine a least cost analysis. It is assumed at this time, based
upon data obtained from the IBM manuals, that the largest of the IBM
360 or 370 computers should be used for the least cost time sharing
operation of the RIC computer search program. Ideally the same 2314
disk drives and the highest speed, highest density tape drives should
be available in the same numbers as for the dedicated system.

Cost Analysis for the IBM 360 or 370 System 

Providing cost analysis data is a complicated proposition
because of the many factors involved. For example, internal to the
program such factors must be considered as: number of searches in a
batch, number of accession numbers or postings to be processed,
number of "hits" obtained, number of hits to be printed, etc. External
to the program such factors must be considered as the model of the
computer and the billing procedure and cost.

An example will be considered having 35 search requests 
resulting in 2950 “hits.” This is a medium sized batch since the 
number of searches in a batch can range from one to over a hundred 
depending upon the computer model and the factors internal to the 
program. It took 10 minutes and 46 seconds for the logic part of the 
program to be performed on an IBM 360/40. At a billing cost of $55 
per clock hour this part of the program was run for $0.28 per search 
request. At this point users with a REMCARD system would terminate 
the program. Most users, however, would utilize computer printing of 
the resumes. For printing the 2950 hits, 112 minutes were required
using the slowest IBM tape drives. Thus, for an additional $2.93 per
search the requesters obtained the printed abstracts sorted in order
by search number.

Note that if billing was based on CPU rather than clock time 
the cost per search would likely be less depending upon the rate 
charged for CPU time. Also note that a fast tape drive would reduce 
the cost per search proportionately. 
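
The per-search figures quoted above follow from the clock-hour rate. The Python sketch below simply repeats that arithmetic with the numbers given in the text (35 searches, $55 per clock hour, 10 minutes 46 seconds of logic time, 112 minutes of printing); the function name is chosen here for illustration.

```python
RATE_PER_CLOCK_HOUR = 55.00   # billing rate stated in the text
SEARCHES_IN_BATCH = 35


def cost_per_search(minutes):
    """Clock time in minutes -> billed dollars per search request."""
    return minutes / 60 * RATE_PER_CLOCK_HOUR / SEARCHES_IN_BATCH


logic_cost = cost_per_search(10 + 46 / 60)   # logic part of the program
printing_cost = cost_per_search(112)         # printing the 2950 hits

print(round(logic_cost, 2))
print(round(printing_cost, 2))
```

Substituting a different rate or batch size shows how sensitive the cost per search is to the billing procedure.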

Configurations for the Mini-Computer

The development of the RIC computer search program was also
undertaken on a Digital Equipment Corporation (DEC) PDP-12 computer.
(Basically the PDP-12 is nothing more than a PDP-8 with analog
capability.) The only significant departure from the IBM 360/370 version
is that presently no computer sorting of resumes by search number is
possible. The particular computer used (South Junior High School,
Grand Forks Public Schools, as part of a Title III ESEA project)
has disk, tape, and printer hardware as well as time-sharing capability.
However, this amount of hardware is not essential as has been proven
and will be described in Appendix A under several of the hardware
configurations to be considered. Also, this program should operate on
any of the many mini-computers supporting the BASIC language with but
minor modifications.

The computer search program operating on the time-sharing PDP-12
computer will be described in this chapter; the following four additional
configurations will be described in Appendix B.

a. Minimal stand-alone system.

b. Ideal stand-alone system.

c. Addition of hardware to minimal existing system.

d. Addition of hardware to ideal existing system.

For each of these configurations the costs for equipment and
maintenance were taken from the latest DEC price list. Some cost
savings could be realized by going to peripherals supplied by
independent companies; however, more problems with maintenance and
software should be anticipated.

The cost analysis figures are based on the following hypothetical
situation:

1. Amortization of the computer system over a five year period.

2. One hundred (100) search requests a month; 6,000 over the
five-year period.

3. Searches run in batches of ten for each operation of the
program.

4. An average of 100 resumes printed per search, each resume
averaging 35 lines of 60 characters in length.

5. The terminal time required to process an average batch
under stand-alone operation is 382 minutes when a disk is available and
465 minutes when only magnetic tape is available. Under time sharing
with five other users the terminal time when a disk is available is
408 minutes while with only magnetic tape it is 502 minutes. All time
estimates are based on the further assumption that a 350 line per
minute printer and a 45 inch per second, 800 bits per inch, IBM compatible
magnetic tape drive are available.

6. One shift results in 40 hours per week for 50 weeks per
year, resulting in 10,000 hours of terminal time over the five-year
period. Running 600 batches of 10 searches in that time in stand-alone
operation with disk available requires 38.2 per cent of the terminal
time while with only magnetic tape requires 46.5 per cent of the
terminal time. Under time sharing the respective percentages are 40.8
and 50.2.
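
The percentages in point 6 can be reproduced from the assumptions in points 1 through 5. The Python sketch below is an illustration of that arithmetic only; the constant and function names are chosen here.

```python
HOURS_AVAILABLE = 40 * 50 * 5        # one shift, 50 weeks a year, five years
BATCHES = (100 * 12 * 5) // 10       # 6,000 searches run in batches of ten


def percent_of_terminal_time(minutes_per_batch):
    """Share of the five-year terminal time used by all batches."""
    hours_needed = BATCHES * minutes_per_batch / 60
    return round(hours_needed / HOURS_AVAILABLE * 100, 1)


for label, minutes in [
    ("stand-alone, disk", 382),
    ("stand-alone, tape only", 465),
    ("time sharing, disk", 408),
    ("time sharing, tape only", 502),
]:
    print(label, percent_of_terminal_time(minutes))
```

Because 600 batches are run against 10,000 available hours, each percentage is simply the batch time in minutes divided by ten.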

The reader will note that cost per search figures are given for
both shared and dedicated use of the respective systems presented in
this chapter and in Appendix B. (It might be possible in many
educational settings to secure access to a mini-computer during evening
and nighttime hours when it is not normally being used, resulting in a
cost less than even that for shared operation.) It must also be
emphasized that these cost per search figures cover only computer
expenses; the time required to code the search for the computer and the
time required to process the resultant resumes following printing by
the computer are not included.

The time-sharing configuration is the hardest for which to
present cost analyses since, as with the 360/370 version, so many
factors enter into the picture. First is the factor of how many users
are served; the time sharing PDP-8 can handle up to 16 (32 with some
minimal additional hardware). The assumption will be made that the
system is handling 6 users, each utilizing the system 40 hours per
week. The second factor is how many 4K memory partitions are available;
the more available, the more users can be utilizing the system
simultaneously. The assumption will be made that 2 partitions are available.
Third, the magnetic tape drive cannot presently be operated under time
sharing; it is assumed that time sharing is shut off during the second
shift and that the dedicated use of the system does not add additional
cost to the computer search application.






$107,500    Initial hardware costs
  32,500    Maintenance contract for 57 months
  72,000    Two shift operator ($600 per month/operator)
  10,000    Overhead
$222,000    Total
+ 24,000    Lease of six teletype units
$246,000    Total
x  .1667    One sixth cost of total system
$ 37,000    Total cost for total operation of terminal
x   40.6    Percentage of terminal time required to run searches (see
            point six above)
$ 16,630    Total cost for computer time
÷  6,000    Number of searches over the five year period
$   2.80    Computer time cost per search
+    .30    Cost for paper on which the resumes are printed
$   3.10    Total cost per search on a shared basis

Assuming a dedicated terminal were to be required the cost
analysis would be as follows:

$222,000    Total cost of configuration e.
x  .1667    One-sixth cost of total system
$ 37,000    Total cost for computer time
+  4,000    Lease of teletype unit
$ 41,000    Total
÷  6,000    Number of searches over the five year period
$   6.83    Computer time cost per search
+    .30    Cost for paper on which the resumes are printed
$   7.13    Total cost per search on a dedicated basis
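
The dedicated-terminal figures can be retraced step by step. The Python sketch below repeats the arithmetic using the amounts listed above; the variable names are chosen here, and the $37,000 share is entered as the rounded value the text uses rather than the unrounded product.

```python
hardware = 107_500
maintenance = 32_500
operators = 72_000      # two operators at $600 per month for five years
overhead = 10_000

total_system = hardware + maintenance + operators + overhead
unrounded_share = total_system * 0.1667   # about 37,007 before rounding

computer_time = 37_000   # rounded share, as stated in the text
teletype_lease = 4_000
searches = 6_000

cost_per_search = (computer_time + teletype_lease) / searches
total_per_search = round(cost_per_search, 2) + 0.30   # add paper cost

print(total_system)
print(round(cost_per_search, 2))
print(round(total_per_search, 2))
```

Changing `searches` or the amortized amounts shows directly how the cost per search responds to a different volume or amortization period.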

Obviously the cost per search figures will fluctuate with the
number of users; the eventual addition of sixteen users paying for the
hardware will cut the cost per search in half. If the magnetic
tape units can be operated under time sharing, additional cost
reductions should be possible. Finally, the refinements of the
mini-computer version now underway will result in further reduction in the
cost for performing searches.

An interesting approach is to combine the two computer program
versions, assigning to each the task it does most cost-effectively.
The 360/370 version would be assigned the task of performing the logic
while the mini-computer would be assigned the task of printing. The
result is a cost per search of less than $2.00, significantly better
than either version of the program can do alone.

In conclusion the reader should note that many factors must be
taken into account when estimating computer costs. Several factors
which have not been considered will now be covered briefly. First,
what is the optimum sized batch for most economical operation? Number
of descriptors, number of postings for the descriptors, type of logic
used, number of hits to print, etc., are all variables to be considered.
The cost per search actually drops in logarithmic fashion, i.e., the cost
for a batch of one or two searches is very high but the cost levels off
fast until after twenty or more searches very little savings result.
Second is the number of searches run. The figure 6,000 has no special
meaning; actually it would appear that more searches would result in
a lower cost per search. Third is the five-year amortization period;
there is no reason why this period could not be longer or shorter,
resulting in varied cost per search estimates. Fourth, 100 resumes
per search will be considered excessive by many. By appropriate
choice of logic and descriptors a reduced number of hits is possible,
resulting in significant time and cost savings. Fifth, greater or
lesser utilization of the computer system than the 40-hour week will
result in appropriate modifications in the cost per search figures.



CHAPTER 5 

ADDITIONAL CONSIDERATIONS 

This concluding chapter will review briefly the following three
considerations:

1. Printing resumes by search number.

2. Creation of inverted files as input to the logic program.

3. Inclusion of locally-generated information collections to be
searched.

Printing Resumes by Search Number 

A consideration the user of this computerized ERIC search procedure
must make is the printing of resumes in order of accession numbers or in
order of search numbers. The first option results in the necessity to
hand sort the resumes, a time-consuming but usually less expensive
proposition than computer sorting.

Users who feel the expense of computer sorting warranted have three
obvious alternatives. First, anyone having access to the QUERY program
might use the QSORT portion of QUERY. It should be possible to modify
QSORT; however, the authors of this report have not tried this
modification. The second alternative is to store the output from the abstract
printing program onto a work tape or disk instead of going to the printer.
A simple routine can then be used to read the work tape or disk once for
every search, printing only those resumes numbered the same as the search
being processed. The third alternative also involves writing the output
on tape or disk. However, the output is then sorted on three dimensions:
1) search number, 2) accession number, and 3) line number of the material
to be printed. It is then possible to print the output in order by search
and accession numbers. This modification has been tried and appears to
be the most cost effective, costing only approximately 30 cents per search
when processing a batch of 70 searches with 5,000 hits to be printed.
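
The third alternative amounts to a sort on a three-part key. The Python sketch below illustrates the idea with a few invented print lines; the record layout and field names are assumptions made for this example and do not describe the actual work tape format.

```python
from collections import namedtuple

# Hypothetical layout of one line of printed output held on the work file.
PrintLine = namedtuple("PrintLine", "search_no accession_no line_no text")

lines = [
    PrintLine(2, "ED 001123", 2, "second line of resume"),
    PrintLine(1, "ED 001535", 1, "first line of resume"),
    PrintLine(2, "ED 001123", 1, "first line of resume"),
    PrintLine(1, "ED 000567", 1, "first line of resume"),
]

# Sort on search number, then accession number, then line number.
ordered = sorted(lines, key=lambda r: (r.search_no, r.accession_no, r.line_no))

for record in ordered:
    print(record.search_no, record.accession_no, record.line_no, record.text)
```

After the sort, all output for search 1 precedes that for search 2, and the lines of each resume stay in their original order.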

Creation of Inverted Files

Creation of an inverted file, such as USEMAST supplied by LEASCO for
the ERIC resumes, is not an unduly complicated concept or task. An
inverted file is simply a one-dimensional search for every term in a selected
field with the results being alphabetized or otherwise ordered. For
example, for USEMAST the "given field" is the descriptor field. For each
descriptor all the documents which are indexed by the descriptor are
identified and the accession, or ED, numbers are listed after the
descriptor as if a one-dimensional search had been performed. The descriptors
are then alphabetized and the result is USEMAST.

Users interested in special applications of the ERIC data base, such
as the ability to perform searches by institution, author, etc., can create
inverted files for these fields. The results can be added to USEMAST so
that the user could, for example, search for all the special education
(descriptor) materials produced by a given institution of higher education.
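
The construction of an inverted file can be sketched in a few lines. The Python below is an illustration of the concept only and is not the RIC inverted file generating program; the function name and the sample documents are invented.

```python
from collections import defaultdict


def build_inverted_file(documents):
    """Map each descriptor to the accession numbers indexed by it."""
    inverted = defaultdict(list)
    for accession_no in sorted(documents):
        for descriptor in documents[accession_no]:
            inverted[descriptor].append(accession_no)
    # Alphabetize the descriptors, as is done for USEMAST.
    return dict(sorted(inverted.items()))


# Hypothetical resumes: accession number -> descriptor field.
documents = {
    "ED 000002": ["SPECIAL EDUCATION", "READING"],
    "ED 000001": ["READING"],
}

inverted_file = build_inverted_file(documents)
for descriptor, postings in inverted_file.items():
    print(descriptor, postings)
```

The same routine applied to an institution or author field, rather than the descriptor field, would yield the additional inverted files mentioned above.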

Locally Generated Information Collections 

Another possibility available to the user of this ERIC search program
is the inclusion of information data bases other than ERIC and CIJE.
The requirement for adding data bases to the system is solely that the
information must be in computer readable format, i.e., on magnetic tape
or disk. The user then turns to the inverted file generating program
described in the preceding section and creates an inverted file on
the one or more fields of interest for searching purposes.
The input data for the fields of interest being utilized for
development of locally-generated inverted files must be consistent throughout
that entire file. For example, if an inverted file were to be generated
for year of publication, then all input data must be consistent in
having a four-digit number for year of publication (1950, 1951, etc.,
not 1950-51, 51, etc.).
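
A consistency check of this kind can be run before the inverted file is generated. The Python sketch below is a hypothetical helper written for this illustration; it flags any year of publication that is not exactly four digits.

```python
import re


def inconsistent_years(values):
    """Return the entries that are not a plain four-digit year."""
    return [value for value in values if not re.fullmatch(r"\d{4}", value)]


field_values = ["1950", "1951", "1950-51", "51"]
print(inconsistent_years(field_values))
```

Entries reported by the check would be corrected in the input data before the field is inverted.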

Once the inverted file is created, the RIC logic program can be
used to search for information. Identical coding procedures are followed
as were shown in Chapter 4. The results can be printed using the resume
printing program; however, some limited modifications might be required
in order that the user can specify exactly what it is desired to print.

The Resource Information Center has developed a general purpose
program for generating inverted files on most any information data base.
Further information about this program and its utilization can be obtained
by contacting either of the authors of this publication.
APPENDIX A

DOCUMENTATION FOR OPTIONS IN RIC IBM
360/370 COMPUTER SEARCH PROGRAM

The user of the RIC computer search program should find it to be
quite well-documented. The programmer has included a brief explanation
for every major section to describe exactly what the program is doing
at that point.

Actually the "program" consists of the following five programs
in two parts, the first performing the logic and the second printing
the "hits."

PART              PROGRAM NAME

I   Logic         DATETIME
                  ERICCARD
                  ERICUSE
                  ERICLOG

II  Printing      ERICABST

The documentation occurring at the beginning of each of the five
programs consists of the following elements:

1. Identification of the program.

2. Function: a brief description of what the program does.

3. Program Options (optional), describing what user specified
options are available.

4. Entry Points.

5. Macros Used.

6. Data Sets Used.

The documentation for each of the five programs will be found
for reference purposes on separate pages headed by the program names.







Program Name: DATETIME

Identification

    System          - ERIC Retrieval System
    Program Name    - DATETIME
    Sponsor         - Resource Information Center
    Programmer      - Lee Brueni
    Installation    - U of North Dakota Computer Center
    Machine         - IBM 360/40G
    OP System       - OS/PCP
    Date Written    - May, 1972

Function

    Obtain and return to the calling program the date and time
    in edited form (i.e., MM/DD/YY, HH.MM.SS).

Entry Points

    DATETIME        - Entry point to program

Macros Used

    Time
    Save
    Return
Program Name: ERICCARD

Identification

    System          - ERIC Retrieval System
    Program Name    - ERICCARD
    Sponsor         - Resource Information Center
    Programmer      - Lee Brueni
    Installation    - U of North Dakota Computer Center
    Machine         - IBM 360/40G
    OP System       - OS/PCP
    Date Written    - December, 1971

Function

    Read and parse each search request and construct descriptor
    records to be used in searching the usage master data set
    (inverted file). The table of descriptors is sorted into alphabetic
    order prior to being written to output set RICWORD. A brief
    summary of the number of searches processed and number of
    descriptors processed is also generated.

    If an error is detected in a search request an error message is
    printed, and the search in error is deleted from the descriptor
    table.

Program Options

    Pagesize        - Change the number of lines per page to
                      a new value from the default of 60.

        Col. 1 - 8      'Pagesize'
        Col. 9 - 10     NN, a number specifying the new page size

Entry Points

    ERICCARD        - Entry point to program
    EOFIN           - End of file branch point for RICIN

Macros Used

    Open
    Close
    Get
    Put
    Call
    Save
    Return

Data Sets Used

    RICIN           - This data set contains the control
                      options to the program. Normally
                      this is the card reader.

    RICLIST         - This data set contains the printed
                      list of the search requests.
                      Errors detected in the search
                      requests, and a summary report
                      indicating the number of searches
                      and descriptors processed, are
                      provided.

    RICWORD         - This data set is used to pass the
                      sorted table of descriptors with
                      logic, search number and word
                      number to the following programs
                      in the retrieval system. The
                      format of the records is defined
                      by 'FMTWORDS'.



Program Name: ERICUSE

Identification

    System          - ERIC Retrieval System
    Program Name    - ERICUSE
    Sponsor         - Resource Information Center
    Programmer      - Lee Brueni
    Installation    - U of North Dakota Computer Center
    Language        - Assembly F
    OP System       - OS/PCP
    Machine         - IBM 360/40G
    Date Written    - December, 1971

Function

    Read the descriptor records created by the ERICCARD Program,
    then search for the descriptor in the usage master data set (also
    called the inverted file). When the descriptor is found, construct
    a record composed of information from the descriptor record created
    by ERICCARD and the accession number, then write the record into
    the accumulation data set.

Program Options

    Pagesize        - Change the number of lines per page to
                      a new value from the default of 60.

        Col. 1 - 8      'Pagesize'
        Col. 9 - 10     NN, a number specifying the new page size

    ACCTYPE         - Define the type of accession numbers
                      that will be allowed to be processed.
                      If this option is not chosen
                      the default will be to allow
                      processing of only ED type of
                      accession numbers. Note that this
                      option may be used more than once
                      to choose more than one of the
                      options.

        Col. 1 - 8      'ACCTYPE'
        Col. 9 - 10     XXX, where XXX is replaced by one of
                        the following choices.

                        ED  - ED type of accession numbers
                              will be accepted. (RIE
                              collection)

                        EJ  - EJ type of accession numbers
                              will be accepted. (CIJE
                              collection)

                        VT  - VT type of accession numbers
                              will be accepted. (Vocational
                              Education)

                        ALL - No check will be made as to
                              type of accession numbers to
                              be processed.

Entry Points

    ERICUSE         - Entry point to program
    EOFUSAGE        - End of file branch point for RICUSAGE
    EOFWORDS        - End of file branch point for RICWORDS
    EOFIN           - End of file branch point for RICIN

Macros Used

    Open
    Close
    Get
    Put
    Read
    Check
    Note
    Point
    Call
    Save
    Return

Data Sets Used

    RICIN           - This data set contains the control
                      options to the program. Normally
                      this is the card reader.

    RICLIST         - This data set contains the printed
                      list of descriptors searched for,
                      the search number and word number
                      of the descriptor plus a count of
                      the number of accession
                      numbers found. If an error
                      occurred in locating a descriptor
                      it is written into this data set.
                      A summary of the number of
                      descriptors and accession numbers
                      processed is also generated.

    RICWORDS        - This data set contains the descriptors
                      to be searched for plus the search
                      number, word number, and logic.
                      The format of the records is
                      defined by the DSECT 'FMTWORDS'.
                      This data set is created by the
                      ERICCARD Program.

    RICUSAGE        - This data set is commonly referred
                      to as the 'inverted file' and
                      contains the accession numbers
                      listed under each descriptor.
                      To enable processing a given
                      descriptor more than once, data
                      set positioning is used. To
                      accomplish this the following I/O
                      macros are used: Read, Check,
                      Note, and Point.

    RICACCM         - This data set is used for the
                      accumulation of the hit records
                      created from the merger of the
                      accession numbers from 'RICUSE'
                      with the search number, word
                      number, and logic operators for
                      a given descriptor from 'RICWORD'.
                      The record layout is given by
                      'FMTACCM'.



Program Name: ERICLOG



Identification

System       - ERIC Retrieval System
Program Name - ERICLOG
Sponsor      - Resource Information Center
Programmer   - Lee Brueni
Installation - U of North Dakota Computer Center
Machine      - IBM 360/40G
OP System    - OS-PCP
Date Written - December, 1971



Function 



Read the logic records from the RICACCM data set created by the
ERICUSE Program and insert all logic records of the same search
number and accession number into the logic table (LOGICTAB). When
the logic table is complete, scan through the table performing
the logic operations level by level, starting at the innermost level
and working outward. If a hit results, write it to the RICHOLD
data set.

Upon completion of processing of the RICACCM data set, the data
set RICHOLD is processed as input, creating the data set RICHITS and
printing the search hit report. The records written to RICHITS
contain the search number, accession number, plus current and
historic counts which are used by the abstract printing program
ERICABST.
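
The grouping and inner-to-outer evaluation described above can be sketched as follows. This is only an illustration: the FMTACCM record layout and the set of logic operators are not given in this section, so the tuple records, the nested-tuple expressions, and the AND/OR/NOT operators are all assumptions.

```python
from itertools import groupby


def evaluate(expr, present):
    """Evaluate a nested logic expression; inner levels are resolved first.

    expr is either a word number (int) or a tuple (operator, operand, ...).
    present is the set of word numbers posted for one accession number.
    """
    if isinstance(expr, int):
        return expr in present
    op, *operands = expr
    values = [evaluate(operand, present) for operand in operands]
    if op == "AND":
        return all(values)
    if op == "OR":
        return any(values)
    if op == "NOT":
        return not values[0]
    raise ValueError(f"unknown operator: {op!r}")


def find_hits(records, logic):
    """Return (search number, accession number) pairs that satisfy the logic.

    records: iterable of (search_no, accession_no, word_no) tuples.
    logic:   dict mapping search_no to its nested expression.
    """
    def key(record):
        return (record[0], record[1])

    hits = []
    # Collect all records of the same search and accession number,
    # as the logic table does, then evaluate the expression once.
    for (search_no, accession), group in groupby(sorted(records, key=key), key=key):
        present = {word_no for _, _, word_no in group}
        if evaluate(logic[search_no], present):
            hits.append((search_no, accession))
    return hits
```

For a search defined as word 1 AND (word 2 OR word 3), an accession number posted under words 1 and 2 is a hit, while one posted under word 1 alone is not.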



Program Options

Pagesize     - Change the number of lines per page
               to a new value from the default
               of 60.

   Col. 1 - 8    - 'Pagesize'
   Col. 9 - 10   - NN a number specifying a new page
                   size.



Entry Points

ERICLOGC - Entry point to program

EOFACCM  - End of file branch point for RICACCM

EOFHOLD  - End of file branch point for RICHOLD

EOFIN    - End of file branch point for RICIN



Macros Used 

Open 

Close 

Get 

Put 

Call 

Save 

Return 



Data Sets Used

RICIN    - This data set contains the control
           options to the program. Normally
           this is the card reader.

RICLIST  - This data set contains the printed
           list of all hits plus a summary
           hits list. The normal output
           device is a line printer.

RICACCM  - This data set contains records
           composed of a search number,
           accession number, word number and
           logic which is to be operated on.
           The format of the records is
           given by FMTACCM.

RICHOLD  - This data set is used to temporarily
           hold the hit records before
           writing them to the hit data set
           RICHITS. The format of the record
           is given by FMTHOLD.

RICHITS  - This data set contains the final hit
           records with the current and
           historic numbers assigned. The
           format of the record is given by
           access.



Program Name: ERICABST 



Identification

System       - ERIC Retrieval System
Program Name - ERICABST
Sponsor      - Resource Information Center
Programmer   - Lee Brueni
Installation - U of North Dakota Computer Center
Language     - Assembly F
OP System    - OS/PCP
Machine      - IBM 360/40G
Date Written - August, 1971



Function 

This routine is used to retrieve and print the abstracts for the
hits accumulated for each search. The maximum number of abstracts
printed for a search can be restricted so as not to print more
abstracts than needed. If the restriction feature is used, then
another option exists: the printing of current or historic abstracts.

For each search, control of what is printed is governed by
a control flag. If a specific control flag is not specified, then
the default control flag is used. The default control flag can be
overridden for the entire run if desired. This allows for the
maximum amount of flexibility in the printing of the abstracts.



Program Options

Pagesize     - Change the number of lines per page
               to a new value from the default
               of 60.

   Col. 1 - 8    - 'Pagesize'
   Col. 9 - 10   - NN a number specifying the new page
                   size.

Tapesbeg     - Define the accession number to be
               regarded as the first on the
               first tape.

   Col. 1 - 8    - 'Tapesbeg'
   Col. 10 - 17  - XXXXXXXX The accession number
                   (such as ED001001)

Tapesend     - Define the accession number to be
               regarded as the last on a given
               tape. Note that one 'Tapesend'
               card must be present for each
               tape to be processed.

VESTAPE      - Define that the Vocational Education
               tape will be processed.
               If not present the default will
               be processing of the ERIC tapes.

   Col. 1 - 8    - 'VESTAPE'

Control Flag - The printing option can be set for
               the entire run or for an individual
               search. The options are
               the type of search (historic or
               current) and print requests
               (suppression or request for a
               specific entry).

   Col. 1 - 4    - Search number. If blank the control
                   flag's new setting pertains to
                   the entire run.

   Col. 5 - 12   - Search type. 'Current' or 'Historic';
                   current is the default.

   Col. 13 - 16  - Abstract limit. The maximum number
                   of abstracts to be printed per
                   search.

   Col. 17 - 39  - Print Control Flag. A '1' indicates
                   that the option will be set on, a
                   '0' indicates the option will be
                   set off, and a ' ' indicates no
                   change. These settings are
                   position dependent and defined as
                   follows:

   Column    Entry                     Position   Default

   Col 17    Accession number              0         1
   Col 18    Clearing house number         1         0
   Col 19    Clearing house number         2         0
   Col 20    Program area                  3         0
   Col 21    Publication date              4         1
   Col 22    Title                         5         1
   Col 23    Personal author               6         1
   Col 24    Institution code              7         0
   Col 25    Sponsoring agency code        8         0
   Col 26    Descriptor                    9         1
   Col 27    Identifier                   10         0
   Col 28    EDRS price                   11         1
   Col 29    Descriptive note             12         1
   Col 30    Issue                        13         0
   Col 31    Abstract                     14         1
   Col 32    Report number                15         0
   Col 33    Contract number              16         0
   Col 34    Grant number                 17         0
   Col 35    Bureau number                18         0
   Col 36    Availability                 19         1
   Col 37    Journal citation             20         1
   Col 38    Institution name             21         0
   Col 39    Sponsoring agency name       22         0

   NOTE: Position refers to the bit position in the control
         flag 'control'.
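
The column layout of the Control Flag card can be illustrated with a small sketch. It is not the program's own code: the function name is hypothetical, and the 23 bit positions are modeled as a string of '0' and '1' characters rather than as bits in the 'control' flag.

```python
# Default setting of positions 0 through 22, taken from the table above.
DEFAULTS = "10001110010110100001100"


def apply_control_card(card, flags=DEFAULTS):
    """Parse one Control Flag card and return the updated settings.

    Returns (search_no, search_type, abstract_limit, flags).
    """
    card = card.ljust(39)
    search_no = card[0:4].strip() or None      # blank: applies to the entire run
    search_type = card[4:12].strip().capitalize() or "Current"
    limit_text = card[12:16].strip()
    abstract_limit = int(limit_text) if limit_text else None

    new_flags = []
    for old, ch in zip(flags, card[16:39]):    # columns 17 - 39
        if ch in ("0", "1"):
            new_flags.append(ch)               # option set off or on
        elif ch == " ":
            new_flags.append(old)              # blank: no change
        else:
            raise ValueError(f"invalid print control character: {ch!r}")
    return search_no, search_type, abstract_limit, "".join(new_flags)
```

A card with '0' in column 17 and blanks elsewhere in columns 18 - 39 would suppress only the accession number and leave every other entry at its current setting.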



Entry Points

ERICABST - Entry point to program.

EOFIN    - End of file branch point for RICIN.

EOFHITS  - End of file branch point for RICHITS.

EOFABST  - End of file branch point for RICABST.



Macros Used 



Open 

Close 

Get 

Put 

Time 

Call 

Feov 

Save 

Return 



Data Sets Used 

RICIN    - This data set contains the control
           options to the program. Normally
           this is the card reader.

RICLIST  - This data set contains the printed
           abstracts that were found for each
           search. Normally this is the
           printer.

RICHITS  - This data set is used to obtain the
           next ERIC accession number whose
           abstract is to be searched for.
           The format for a record in this
           data set is defined by the DSECT
           EDHIT.

RICABST  - This data set is used to search for
           the proper abstracts. An entry in
           a record has the following form:
           1) 2 BYTE length code - this is the
              total length of an entry, from the
              beginning of the length code to
              the end of the entry's information.
           2) 2 BYTE entry code - this
              contains a code that identifies
              what type of an entry this is.
              These codes are defined in the
              VECTOR table 'ENTCODE'.
           3) The information for this entry.
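
The entry layout described for RICABST can be sketched as a small reader. This is an illustration under stated assumptions: the two codes are treated as big-endian unsigned integers (the byte order of the System/360), entries are assumed to follow one another without padding inside a record, and the function name is hypothetical. The meaning of each entry code is defined in 'ENTCODE' and is not reproduced here.

```python
import struct


def iter_entries(record: bytes):
    """Yield (entry_code, information) for each entry in one record."""
    offset = 0
    while offset + 4 <= len(record):
        # 2-byte length code followed by 2-byte entry code.
        length, code = struct.unpack_from(">HH", record, offset)
        # The length covers the entry from the start of the length code
        # to the end of its information, so it can never be below 4.
        if length < 4 or offset + length > len(record):
            raise ValueError(f"bad entry length {length} at offset {offset}")
        yield code, record[offset + 4:offset + length]
        offset += length
```

The information bytes are returned untouched; on the original tapes they would presumably need character-set translation before printing.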
APPENDIX B



ADDITIONAL HARDWARE CONFIGURATIONS AND COSTS FOR
THE MINI-COMPUTER PROGRAM

a. Minimal Stand-Alone System

This system would require a PDP-8 with 8k memory, a dual DEC
tape unit, an IBM compatible magnetic tape drive, a line printer, a
teletype and the prerequisites required by this hardware. The following
chart shows the costs associated with this hardware configuration over
a five year period.

  $ 44,000   Initial hardware costs
    18,000   Maintenance contract for 57 months
    36,000   One shift operator ($600 per month)
     5,000   Overhead
  --------
  $103,000   Total
  x   46.5   Per cent of terminal time required to run searches
             (refer to Chapter 4)
  --------
  $ 48,000   Total cost for computer time
  ÷  6,000   Number of searches over the five year period
  --------
  $   8.00   Computer time cost per search
  +    .30   Cost for paper on which the resumes are printed
  --------
  $   8.30   Total cost per search on a shared basis

Assuming that a dedicated system was required, a high speed
teletype unit could be used in place of the present teletype and line
printer. This would result in the following cost analysis:

  $103,000   Total cost of configuration a
  - 10,000   Reduction in cost when eliminating the line printer
  --------
  $ 93,000   Resultant total
  ÷  6,000   Number of searches over the five year period
  --------
  $  15.50   Total cost per search on a dedicated basis

b. Ideal Stand-Alone System

The addition of a random access disc to configuration a will
result in considerable time savings.

  $103,000   Total cost of configuration a
     7,000   Additional cost for disc with maintenance
  --------
  $110,000   Total
  x   38.2   Per cent of terminal time required to run searches
             (refer to Chapter 3)
  --------
  $ 42,000   Total cost for computer time
  ÷  6,000   Number of searches over the five year period
  --------
  $   7.00   Computer time cost per search
  +    .30   Cost for paper on which the resumes are printed
  --------
  $   7.30   Total cost per search on a shared basis

Assuming a dedicated system was desired, the faster teletype
as described in configuration a would be required.

  $110,000   Total cost of configuration b
  - 10,000   Reduction in cost when eliminating the line printer
  --------
  $100,000   Resultant total
  ÷  6,000   Number of searches over the five year period
  --------
  $  16.67   Total cost per search on a dedicated basis
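
The arithmetic behind the tables for configurations a and b can be checked with a short sketch. The function and variable names are hypothetical. The manual rounds the computer-time total to the nearest thousand dollars before dividing, so the unrounded shared-basis results differ from the printed figures by a few cents.

```python
SEARCHES = 6000  # number of searches over the five year period


def cost_per_search(total_cost, share_of_time=1.0, paper=0.0, searches=SEARCHES):
    """Charge a share of the total to searching, spread it over all
    searches, and add the per-search paper cost."""
    return total_cost * share_of_time / searches + paper


# Configuration a: 46.5 per cent of terminal time, $.30 paper per search.
shared_a = cost_per_search(103_000, 0.465, paper=0.30)
# Dedicated system: line printer removed, whole cost charged to searches.
dedicated_a = cost_per_search(93_000)

# Configuration b: disc added, 38.2 per cent of terminal time.
shared_b = cost_per_search(110_000, 0.382, paper=0.30)
dedicated_b = cost_per_search(100_000)

for name, value in [("a shared", shared_a), ("a dedicated", dedicated_a),
                    ("b shared", shared_b), ("b dedicated", dedicated_b)]:
    print(f"{name:12s} ${value:6.2f}")
```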



c. Addition of Hardware to Minimal Existing System

Assuming availability of a PDP-8 with 8k memory, dual DEC tape
drives, and a teletype on which sufficient time can be obtained in
return for additional hardware, the hardware required would be the
line printer and magnetic tape drive.

  $ 22,500   Initial hardware costs
     8,500   Maintenance contract for 57 months
    24,000   Operator (600 batches x 8.0 hours/batch x $5/hour)
  --------
  $ 55,000   Total cost for computer time
  ÷  6,000   Number of searches over the five year period
  --------
  $   9.20   Computer time cost per search
  +    .30   Cost for paper on which the resumes are printed
  --------
  $   9.50   Total cost per search on a dedicated or shared basis

Because the operator costs are dependent upon the amount of operating
time, there would be no appreciable cost reduction by printing on a
teletype or other slow speed printer.

d. Addition of Hardware to Ideal Existing System

The ideal system would have available a random access disk
in addition to the hardware listed under configuration c. The only
difference in the cost analysis of c is the operator time as shown
in the following table.

  $ 55,000   Total cost for computer time for configuration c
  -  4,500   Reduction in operator cost (600 batches x (8.0 hours/batch
             for tape - 6.5 hours/batch for disk) x $5/hour)
  --------
  $ 50,500   Revised total cost for computer time
  ÷  6,000   Number of searches over the five year period
  --------
  $   8.40   Computer time cost per search
  +    .30   Cost for paper on which the resumes are printed
  --------
  $   8.70   Total cost per search on a dedicated or shared basis

It should be noted that the cost per search estimates for this and
configuration c show the amortization of the hardware charged entirely
to the computer search application. Assuming the amortization is
spread over all applications that use the hardware and/or that either
the line printer or tape drive is already available, a considerable
reduction in the cost per search would result.
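
For configurations c and d the operator cost depends on running time, and the difference between tape and disk can be reproduced with a brief sketch. The names below are hypothetical; the figures are those given in the tables, which round the per-search results to the nearest ten cents.

```python
HARDWARE = 22_500      # initial hardware costs
MAINTENANCE = 8_500    # maintenance contract for 57 months
BATCHES = 600          # batches over the five year period
RATE = 5               # operator cost in dollars per hour
SEARCHES = 6000        # number of searches over the five year period


def total_cost(hours_per_batch):
    """Hardware plus maintenance plus operator time for all batches."""
    return HARDWARE + MAINTENANCE + BATCHES * hours_per_batch * RATE


total_c = total_cost(8.0)   # tape only
total_d = total_cost(6.5)   # with random access disk
saving = total_c - total_d  # reduction in operator cost

per_search_c = total_c / SEARCHES
per_search_d = total_d / SEARCHES
```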
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